Introduction

AISL  (Assured  Information  Sharing  Lifecycle)  is  a  MURI  project  that  is  developing  new  approaches  to  support  assured 
information  sharing.  The  team  includes  researchers  from  UMBC,  Purdue  University,  and  the  Universities  of  Illinois, 
Michigan, Texas at Dallas, and Texas at San Antonio. Our research efforts were organized around four areas: (1) creating
novel assured information sharing models and frameworks and new policy languages and systems that support
them,  (2)  developing  algorithms  and  systems  for  information  integration,  analysis  and  mining  that  assure  quality  and 
protect  privacy,  (3)  analyzing  social  aspects  of  information  sharing  including  incorporating  incentives  for  sharing  and 
exploiting  knowledge  of  underlying  social  networks  and  relations,  and  (4)  implementing  and  evaluating  experimental 
software  architectures  and  systems  to  realize  the  assured  information  sharing  life  cycle.  Our  fourth  year  was  productive 
with  significant  results  achieved  across  all  areas  of  the  project.  We  briefly  describe  some  of  these  results  below  and  list 
selected  recent  publications. 

Purdue  University 

Bertino  and  collaborators  have  developed  the  foundations  for  a  new  access  control  model  that  overcomes  drawbacks  of 
well-known  existing  access  control  models  (namely  RBAC  and  XACML).  The  language  associated  with  the  proposed 
model, called extensible Functional Language for Access Control (xfACL), is based on a functional notation. Dr. Clifton,
along with student Ahmet Erhan Nergiz, has been working on techniques for processing partially encrypted databases,
so that data that can be shared is shared while data that must be protected stays protected. Instead of separating
disclosable and non-disclosable data (and thus potentially losing linkages between them), this technology will enable
a single database that shows the entire data to an appropriately authorized user, but shows only a limited view of the
data to a lower-authority user (including the server itself). Clifton also worked with team member Murat
Kantarcioglu at the University of Texas at Dallas on learning from data where adversaries are actively altering the data
to "hide their tracks".

University  of  Illinois 

The UIUC team continued research on the theme of integrating, mining and assessing the quality of shared information.
To this end, they have published over 200 papers in major conferences and journals and developed new models
and algorithms for a number of key problems such as trust modeling and propagation, opinion integration and summarization,
and latent topic structure mining. The following are a few highlights of their recent accomplishments.

Assessing  trustworthiness  of  text  data  is  very  important  for  the  information  management  component  of  their  whole 
project,  but  it  poses  special  challenges  due  to  the  difficulty  in  natural  language  understanding.  To  solve  this  problem, 
they extended a trust framework that they developed previously (i.e., TruthFinder) and proposed a more general,
unified trust propagation framework to compute trustworthiness scores for sources and textual claims in the presence
of quality measurements for evidence provided by humans. They instantiated the framework to model trustworthiness of
news  and  predict  if  coverage  on  a  given  topic  is  trustworthy.  Evaluation  results  show  that  the  proposed  new  framework 
is  effective  for  modeling  trust  of  text  data. 
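
The mutual reinforcement idea behind trust propagation can be illustrated with a small sketch. It is not the team's framework: it is a simplified fixed-point iteration that ignores evidence-quality measurements, and the function name `propagate_trust`, the normalization, and the toy data are all assumptions made for illustration. It assumes every source asserts at least one claim.

```python
def propagate_trust(assertions, iterations=20):
    """assertions maps each source to the set of claims it asserts.

    Returns (source_trust, claim_score); all values lie in [0, 1].
    """
    sources = list(assertions)
    claims = sorted({c for cs in assertions.values() for c in cs})
    source_trust = {s: 0.5 for s in sources}  # neutral starting point
    claim_score = {}
    for _ in range(iterations):
        # A claim gains support from the trust of every source asserting it.
        raw = {c: sum(source_trust[s] for s in sources if c in assertions[s])
               for c in claims}
        top = max(raw.values())
        claim_score = {c: v / top for c, v in raw.items()}
        # A source is trusted to the extent its claims are well supported.
        source_trust = {s: sum(claim_score[c] for c in assertions[s]) / len(assertions[s])
                        for s in sources}
    return source_trust, claim_score


# Hypothetical data: A and B agree on "x"; C stands alone with "z".
trust, score = propagate_trust({"A": {"x", "y"}, "B": {"x"}, "C": {"z"}})
```

In this toy run, source B (which only asserts the corroborated claim) ends up with the highest trust and the isolated source C with the lowest.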

Integration  and  summarization  of  scattered  opinions  collected  from  multiple  sources  in  an  information  sharing 
framework are critical to enable effective use of the shared information. They have developed a number of general techniques
for integrating scattered opinions and summarizing them. First, they developed a general strategy for leveraging
an online ontology to organize scattered opinions into meaningful aspects. Second, they proposed a novel summarization
algorithm for generating concise, abstractive summaries of redundant opinions. Third, they proposed a novel opinion
analysis problem called Latent Aspect Rating Analysis and developed two probabilistic models for solving this problem.
These models enable detailed analysis of opinions expressed in vast numbers of online reviews by decomposing overall
ratings into ratings on each specific aspect and inferring the relative weights reviewers place on different aspects.


The  Illinois  team  has  developed  a  graph-based  regularization  framework,  GNetMine,  to  model  the  link  structure  in 
heterogeneous  information  networks  with  arbitrary  network  schema  and  number  of  object/link  types.  Specifically,  they 
explicitly differentiate the multi-typed link information by incorporating it into different relation graphs. Efficient
computational schemes are introduced to solve the corresponding optimization problem. Experiments on the DBLP data set
show that their algorithm significantly improves classification accuracy over existing state-of-the-art methods
(including both network/graph classification methods and their recently developed rank-based clustering methods).
Furthermore, they integrated ranking with heterogeneous information network classification and developed the RankClass
algorithm, which iteratively refines both ranking and classification in information networks. This yields a higher-quality
classification model as well as a good ranking for each type of node in heterogeneous information networks. The
experiments show that the method derives even better quality classification models than GNetMine.

Heterogeneous information networks, i.e., logical networks involving multi-typed, interconnected objects, are
ubiquitous. It is necessary to provide functions for finding similar objects in these networks, e.g., similar authors and
papers in a bibliographic network. Unfortunately, multi-typed networks lack both a definition of similarity and
effective algorithms for similarity search. They proposed an intuitive meta-path-based similarity definition:
a user can specify a meta path (a sequence of relations) to determine similarity scores among linked objects.
Multiple meta paths can then be combined to address complex queries. While this definition is flexible enough to represent
different similarity queries, it requires expensive computations (e.g., matrix multiplications) that are not affordable in
large-scale information networks. Thus they developed an efficient solution that partially materializes short meta paths and
then concatenates them online to compute results. The proposed method could improve search performance by
20% to 300%. Moreover, to further explore the power of meta paths, they studied the selection and use of meta paths to
predict links and relationships in heterogeneous networks and showed that training can be performed to select the critical
meta paths that enhance predictability in heterogeneous networks; this was verified in experiments on the DBLP datasets.
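
A minimal sketch of the meta-path idea on a toy bibliographic network follows. The short path author-paper-venue is materialized once by a matrix product, and the longer symmetric path author-paper-venue-paper-author is obtained online by concatenating two copies of it. The normalization used in `similarity` (twice the cross path count divided by the two self path counts) is one common choice and is an assumption here, as are the matrices and function names.

```python
def matmul(a, b):
    """Plain dense matrix product for small illustrative matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Hypothetical adjacency matrices.
AP = [[1, 1, 0, 0],   # author 0 wrote papers 0 and 1
      [0, 0, 1, 0],   # author 1 wrote paper 2
      [0, 0, 0, 1]]   # author 2 wrote paper 3
PV = [[1, 0],         # papers 0, 1, 2 appeared in venue 0
      [1, 0],
      [1, 0],
      [0, 1]]         # paper 3 appeared in venue 1

# Materialized short meta path: author -> paper -> venue.
AV = matmul(AP, PV)


def path_count(x, y):
    """Number of author-paper-venue-paper-author path instances from x to y,
    computed online by concatenating the materialized short path with itself."""
    return sum(AV[x][v] * AV[y][v] for v in range(len(AV[0])))


def similarity(x, y):
    """Normalized meta-path similarity in [0, 1] (assumed normalization)."""
    denom = path_count(x, x) + path_count(y, y)
    return 2.0 * path_count(x, y) / denom if denom else 0.0
```

Here authors 0 and 1 publish in the same venue and so receive a positive score, while author 2, who publishes elsewhere, has similarity zero to both.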

Data cubes play an essential role in data analysis and decision support. In a data cube, data from a fact table is
aggregated on subsets of the table’s dimensions, forming a collection of smaller tables called cuboids. When the fact table
includes sensitive data such as salary or diagnosis, publishing even a subset of its cuboids may compromise individuals’
privacy. In this study, they address this problem using differential privacy (DP), which provides provable privacy
guarantees for individuals by adding noise to query answers. They choose an initial subset of cuboids to compute directly
from the fact table, injecting DP noise as usual, and then compute the remaining cuboids from the initial set. Given a
fixed privacy guarantee, they show that it is NP-hard to choose the initial set of cuboids so that the maximal noise over
all published cuboids is minimized, or so that the number of cuboids with noise below a given threshold is maximized.
They provide an efficient procedure, with running time polynomial in the number of cuboids, that selects an initial set of
cuboids for which the maximal noise in all published cuboids is approximately minimized. They also show how to enforce
consistency in the published cuboids while simultaneously improving their utility (reducing error).
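
The two ways of producing a cuboid described above can be sketched as follows, assuming count cuboids and the standard Laplace mechanism. The fact table, domain, and function names are hypothetical, and the sketch spends the whole privacy budget `epsilon` on a single initial cuboid; with several initial cuboids the budget would have to be divided among them.

```python
import itertools
import random
from collections import Counter, defaultdict


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_cuboid(rows, dims, domain, epsilon, rng):
    """Count cuboid over `dims`, computed from the fact table with Laplace noise.

    Every cell of the domain gets noise, including empty ones. Adding or
    removing one row changes one count by 1, so the noise scale is 1 / epsilon.
    """
    counts = Counter(tuple(row[d] for d in dims) for row in rows)
    cells = itertools.product(*(domain[d] for d in dims))
    return {cell: counts[cell] + laplace_noise(1.0 / epsilon, rng) for cell in cells}


def roll_up(cuboid, dims, keep):
    """Derive a coarser cuboid by summing noisy cells (post-processing only)."""
    idx = [dims.index(d) for d in keep]
    out = defaultdict(float)
    for cell, value in cuboid.items():
        out[tuple(cell[i] for i in idx)] += value
    return dict(out)


# Hypothetical fact table with two dimensions.
rows = [
    {"dept": "eng", "city": "A"},
    {"dept": "eng", "city": "B"},
    {"dept": "ops", "city": "A"},
]
domain = {"dept": ["eng", "ops"], "city": ["A", "B"]}
dims = ["dept", "city"]

rng = random.Random(0)
base = noisy_cuboid(rows, dims, domain, epsilon=1.0, rng=rng)  # initial cuboid
by_dept = roll_up(base, dims, ["dept"])                        # derived cuboid
```

Each cell of `by_dept` sums two independently perturbed cells of `base`, so its noise variance is twice that of a base cell. This accumulation is the trade-off that the choice of the initial set has to balance.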

University  of  Maryland,  Baltimore  County 

The UMBC team worked on three areas during the past program: developing approaches and tools to enhance assured
information  sharing  in  social  networking  contexts,  using  policies  and  context  to  enforce  privacy  policies  for  information 
sharing  in  mobile  devices  such  as  smart  phones,  and  applying  policies  and  machine  learning  to  detect  malicious  nodes 
in  ad  hoc  networks. 

They developed an implementation of the g-SIS group-centric access control model and demonstrated its usefulness
for information sharing use cases in social media. Contributions include the prototype implementation, extensions to
the model such as hierarchical groups and necessary and sufficient conditions, and the use of the Semantic Web language
OWL for representing the central g-SIS concepts and associated data. The framework takes a pragmatic approach:
it uses Semantic Web technology to represent and reason about the hierarchy and a procedural method to compute access
decisions relying on the g-SIS semantics. They also developed a system that helps users maintain the information sharing
groups typical of social networking systems like Facebook and Google+. They implemented a system that classifies
a user's new connections into one or more existing groups based on each connection's attributes and relations, and
demonstrated the approach using data collected from real Facebook users. Another significant challenge is posed by
hierarchical and overlapping groups. They showed that the system classifies new connections into these groups with high
accuracy even with only 10-20% labeled data.

Recent  years  have  seen  a  confluence  of  two  major  trends  —  the  increase  of  mobile  devices  such  as  smart  phones  as 
the  primary  access  point  to  networked  information  and  the  rise  of  social  media  platforms  that  connect  people.  Their 
convergence  supports  the  emergence  of  a  new  class  of  context-aware  geo-social  networking  applications.  While  existing 
systems  focus  mostly  on  location,  their  work  centers  on  models  for  representing  and  reasoning  about  a  more  inclusive 
and higher-level notion of context, including the user's location and surroundings, the presence of other people and
devices, and the inferred activities in which they are engaged. A key element of the work is the use of collaborative
information sharing where devices share and integrate knowledge about their context. This introduces the need for privacy
and security mechanisms. They developed a framework to provide users with appropriate levels of privacy to protect
the personal information their mobile devices are collecting, including the inferences that can be drawn from the
information. They used Semantic Web technologies to specify high-level, declarative policies that describe user information
sharing  preferences.  They  implemented  and  evaluated  a  prototype  system  that  aggregates  information  from  a  variety  of 
sensors  on  the  phone,  online  sources,  and  sources  internal  to  the  campus  intranet,  and  infers  the  dynamic  user  context. 
The  policy  framework  can  be  effectively  used  to  devise  better  privacy  control  mechanisms  to  control  information  flow 
between  users  in  such  dynamic  mobile  systems. 

Mobile  Ad-hoc  Networks  (MANETs)  are  extremely  vulnerable  to  a  variety  of  misbehaviors  because  of  their  basic 
features,  including  lack  of  communication  infrastructure,  short  transmission  range,  and  dynamic  network  topology.  To 
detect and mitigate those misbehaviors, trust management schemes have been proposed that rely on pre-defined weights
to determine how each apparent misbehavior contributes to an overall measure of trustworthiness. The extremely
dynamic nature of MANETs makes it difficult, however, to determine a set of weights that is appropriate for all contexts.
The UMBC group developed an automated trust management scheme for MANETs that uses machine learning to
classify nodes as malicious. The scheme is far more resilient to the context changes common in MANETs, such as those
due  to  malicious  nodes  altering  their  misbehavior  patterns  over  time  or  rapid  changes  in  environmental  factors,  such  as 
the  motion  speed  and  transmission  range.  The  evaluation  results  on  simulation  studies  showed  it  to  be  effective  and  to 
perform  significantly  better  than  other  approaches. 
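
To illustrate the difference between pre-defined and learned weights, the sketch below fits the weights of a linear trust score from labeled observations instead of fixing them by hand. The report does not state which learning algorithm or features the UMBC scheme uses, so the logistic-regression stand-in, the two features (packet drop rate and packet modification rate), and all data values are assumptions.

```python
import math


def train_logistic(samples, labels, epochs=500, lr=0.5):
    """Fit weights and bias by stochastic gradient descent on the logistic loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of "malicious"
            g = p - y                        # gradient of the loss with respect to z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def is_malicious(x, w, b):
    """Classify a node from its observed misbehavior rates."""
    return b + sum(wi * xi for wi, xi in zip(w, x)) > 0


# Hypothetical observations: (drop_rate, modify_rate) per node; label 1 = malicious.
samples = [(0.05, 0.00), (0.10, 0.02), (0.08, 0.01),
           (0.60, 0.10), (0.20, 0.50), (0.70, 0.40)]
labels = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(samples, labels)
```

When the context changes, for example when nodes shift from dropping packets to modifying them, the weights can be refit from fresh observations rather than re-tuned manually.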

University  of  Michigan 

At  the  University  of  Michigan,  Adamic  and  collaborators  have  conducted  empirical  and  modeling  studies  of  two  aspects 
of  assured  information  sharing.  The  first  is  determining  whether  ratings  of  information  can  themselves  be  relied  upon. 
The  second  is  discovering  and  measuring  how  multiple  propagation  steps  can  distort  information  as  it  is  transmitted. 

They  examined  the  reliability  of  human-supplied  ratings  when  individuals  are  asked  to  rate  other  individuals  or  the 
information  they  have  provided.  Last  year  they  showed  that  ratings  are  inflated  if  they  are  given  publicly,  if  they  are 
identified, and if there is potential for reciprocity. This year, they followed up with a large-scale data analysis of
millions of user-to-user ratings, complemented by a survey of over 500 users of the website Couchsurfing.org and 18
in-depth interviews. In order to understand the ratings, they revisited the notions of friendship and trust and uncovered an
asymmetry: close friendship includes trust, but high levels of trust can be achieved without close friendship. To users,
providing faceted ratings presents challenges, including differentiating and quantifying inherently subjective feelings
such as friendship and trust, concern over a friend’s reaction to a rating, and knowledge of how ratings can affect
others’ reputations. One consequence of these issues is the near absence of negative feedback, even though a small portion
of actual experiences and privately held ratings are negative. They show how users take this into account when
formulating and interpreting ratings, and discuss designs that could encourage more balanced feedback.

Information dissemination is becoming increasingly distributed, especially with increased use of social media.
In social media, content often makes multiple hops and consequently has the opportunity to change. In this study they
focus on content that should change the least, namely quoted text. They find changes to be frequent, with their
likelihood depending on the authority of the copied source and the type of site that is copying. They uncover patterns in the
rate of appearance of new variants, their length, and popularity, and develop a simple model that is able to capture them.
These patterns are distinct from ones produced when all copies are made from the same source, suggesting that
information is evolving as it is being processed collectively in online social media.

University  of  Texas,  San  Antonio 

To share information and retain control (share-but-protect) is a classic cyber security problem for which effective
solutions continue to be elusive. Where the patterns of sharing are well defined and slow to change, it is reasonable to apply
the traditional access control models of lattice-based, role-based and attribute-based access control, along with
discretionary authorization for further fine-grained control as required. This dissemination-centric approach offers
considerable flexibility in terms of controlling a particular information object with respect to already defined attributes
of users, subjects and objects. However, it has many of the same or similar problems that discretionary access control manifests
relative to role-based access control. In particular, specifying information sharing patterns beyond those supported by
currently defined authorization attributes is cumbersome or infeasible. Therefore, UTSA researchers have developed
and formalized a novel mode of information sharing called group-centric. Group-centric secure information sharing
(g-SIS) is designed to be agile and accommodate ad hoc patterns of information sharing. A g-SIS theory has been
developed for isolated groups, which are essentially sinks wherein information is brought into a group (akin to a secure
meeting room) to be shared by group members. New information is also developed within the group. The UTSA team is
currently extending the theory to connected groups wherein information in one group can be made accessible to
members of another group by various subordination relations.
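
The "secure meeting room" picture of an isolated group can be sketched as a small class in which access depends on group membership rather than on per-object attributes. This is an illustration only, not the formal g-SIS model: the class and method names are hypothetical, and the `strict_join` flag (whether a member may read objects added before that member joined) is an assumed policy option that this report does not describe.

```python
class Group:
    """Toy isolated group: users join and leave, objects are added and removed."""

    def __init__(self, strict_join=True):
        self.strict_join = strict_join
        self.clock = 0        # logical time, advanced on every join or add
        self.members = {}     # user -> time of joining
        self.objects = {}     # object -> time it was added

    def _tick(self):
        self.clock += 1
        return self.clock

    def join(self, user):
        self.members[user] = self._tick()

    def leave(self, user):
        self.members.pop(user, None)

    def add(self, obj):
        self.objects[obj] = self._tick()

    def remove(self, obj):
        self.objects.pop(obj, None)

    def can_read(self, user, obj):
        # Only current members may read objects currently in the group.
        if user not in self.members or obj not in self.objects:
            return False
        if self.strict_join:
            # Assumed policy: only objects added after the user joined are visible.
            return self.objects[obj] > self.members[user]
        return True


room = Group(strict_join=True)
room.join("alice")
room.add("memo")
room.join("bob")
```

With this sequence of events, `alice` can read `memo` but `bob`, who joined after the memo was added, cannot; constructing the group with `strict_join=False` would let both read it.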

University  of  Texas,  Dallas 

Kantarcioglu, Thuraisingham, Khan and Bensoussan have worked on incentive-compatible distributed data mining
schemes, economic incentives for privacy-preserving technology adoption, cloud computing based tools for assured
information sharing, and social network privacy issues. The group’s incentive-compatible distributed data mining results
indicate that they can encourage truthful data sharing, without requiring the ability to audit or verify the data, under
cooperative coalition formation scenarios. They prove that these mechanisms are incentive compatible under reasonable
assumptions. In addition, they provide extensive experimental data that shows the viability of the mechanisms in
practice. Their economic analysis of privacy-preserving technology (PPT) adoption indicates that in many cases significant
government subsidies are needed to encourage PPT adoption. For cases where few individuals value privacy and are
extremely profitable to a firm, there is a possibility for market-based solutions. They developed a new social graph
anonymization scheme to protect against sensitive value inference attacks. The group is collaborating with Dr. Steve Barker
(who is funded by EOARD) to demonstrate assured information sharing using the secure cloud data and policy
management system they have developed at UTD. This system was demonstrated at the September 2011 meeting in
Washington DC.